Conversation
PanZezhong1725
commented
Apr 29, 2026
| @@ -0,0 +1,34 @@ | |||
| class InfinilmProcessor: | |||
Collaborator
Author
There was a problem hiding this comment.
这个文件是核心修改。
多模态模型引入之后,不同模型有不同的处理输入message的逻辑。
处理过程可抽象为三步:
- apply chat template:返回文本,注意这里的文本不能直接encode,而需要调用考虑多模态输入的process
- process:传入template好的prompt、所有图片视频等,返回processed_input(包含pytorch张量,hf功能限制导致)
- batch:将scheduler output中的所有request的processed_input整合成infinicore tensor的batch(比如加入continuous batching所需的输入)
Collaborator
Author
Collaborator
Author
pengcheng888
reviewed
May 9, 2026
| enable_graph_compiling=enable_graph, | ||
| attention_backend=attn_backend, | ||
| kv_cache_dtype=cfg.kv_cache_dtype, | ||
| model = LLM( |
Collaborator
There was a problem hiding this comment.
这样改后,离线推理单测的脚本, 也会走到了服务的 调度和cache管理 的流程么
Collaborator
There was a problem hiding this comment.
离线单测来说,要经过 调度、cache的队列、block分配。
更新pd分离服务后, 调度和 cache管理两部分中都会加上 kv_connecter的逻辑, 以及 forward前后也会有kv_connecter的代码逻辑。
这样对于jiuge.py的来说,经过的代码是不是太多了
wooway777
added a commit
that referenced
this pull request
May 9, 2026
issue/334 增加AutoInfinilmProcessor基建 #335
wooway777
approved these changes
May 9, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.





No description provided.